Crystal structure of CD147 extracellular region and use thereof

ABSTRACT

A crystal of the CD147 extracellular region, a method for preparing it, and its 3D structure are provided. The 3D structure is useful for determining the active sites of the CD147 extracellular region by computer modeling or molecular docking. The crystal and/or the 3D structure are useful in structure-based drug design and in the selection of an antibody, a ligand or an interacting molecule of the CD147 extracellular region.

FIELD OF THE INVENTION

The present invention relates to the crystal of CD147 extracellular region. The invention also relates to the steric structure of the crystal determined by X-ray diffraction and use thereof. More specifically, the invention relates to the use of the steric structure in the computational molecular modeling for determining the three dimensional structure and the active sites of a complex of CD147 extracellular region with its corresponding antibody HAb18, ligand or interacting molecule, and also to the use of the computational molecular modeling for identifying a protein, a peptide or a chemical, including an inhibitor, that can bind to CD147 extracellular region. The invention also relates to the identification of a CD147 inhibitor and the medical use thereof.

BACKGROUND OF THE INVENTION

CD147 is a widely expressed transmembrane glycoprotein of extensive physiological and pathological significance. Its primary functions include inducing the secretion of extracellular matrix metalloproteinases (the EMMPRIN function), interacting with CypA, mediating inflammation, and facilitating viral infection of host cells. CD147 is closely related to embryonic development, tumor invasion and metastasis, the occurrence and development of inflammation, and viral infection and proliferation.

CD147 is widely expressed in hematopoietic and non-hematopoietic cells, such as epithelial cells, endothelial cells and lymphocytes. It is a transmembrane glycoprotein with a molecular weight of 50-60 kD. HAb18G/CD147, a highly glycosylated transmembrane protein, is a new member of the CD147 family and a novel liver cancer related membrane antigen. The anti-hepatoma monoclonal antibody HAb18, developed and purified in the inventors' lab, was used to screen a cDNA library of human liver cancer cells, and a cDNA fragment (about 1.6 kb) corresponding to antigen HAb18G was cloned. A GenBank search showed that the cDNA sequence of antigen HAb18G is highly homologous to the nucleotide sequence of the human CD147 molecule. Further analysis of the open reading frames showed that both proteins are encoded by the same gene. Based on this finding, we confirmed the identity of the two molecules at the protein level in several respects. A further study showed that CD147 is an inducer of matrix metalloproteinases (MMPs) on the tumor cell surface and can stimulate fibroblasts to synthesize matrix metalloproteinases. HAb18G/CD147 was therefore assumed to possess the EMMPRIN function of the CD147 molecule. In a previous study using a tumor-bearing mouse model, we showed that different doses of iodine-131 labeled metuximab monoclonal antibody injection produced different degrees of tumor suppression, and that tumor suppression at both the medium and high doses differed significantly from that in the negative control group. Subsequently, a monoclonal antibody was labeled with ¹³¹I to prepare ¹³¹I labeled metuximab monoclonal antibody injection (LICARTIN), which can be used safely and effectively to treat primary liver cancer. Another clinical study showed that LICARTIN can be used as an anti-recurrence drug for liver cancer after liver transplantation. After a 1-year follow-up, the liver transplant patients in the treated group had a lower recurrence rate and a higher survival rate than those in the control group. The above studies demonstrate that the CD147 molecule is a novel drug target for the treatment of tumors such as liver cancer.

Using the antibody HAb18, we studied the expression profile of CD147 in various tissues and found that the CD147 molecule is highly expressed (69.47%) in cancer tissues of epithelial origin, is not expressed in benign tumors, and is expressed at a lower level in embryonic tissue and normal tissue (2.67% and 10.62%, respectively). It is a broad-spectrum cancer-specific tumor marker.

Human CD147 is a protein consisting of 269 amino acids with a molecular weight of about 28 kD, and belongs to the type I transmembrane protein family. From the cloned cDNA sequence, the mRNA of the CD147 molecule is about 1.7 kb in length. No TATA box or CAAT box was found in the region associated with the transcription start site, but the transcription start site is located within CpG islands, especially in the region from nucleotide −247 to nucleotide +6. There is a non-coding region of about 115 nucleotides before the N-terminal start codon, and the coding region encodes 269 amino acids, in which 21 amino acids form a signal peptide, 185 amino acids in the middle form an extracellular domain, 24 amino acids at positions 206 to 229 form a transmembrane region, and 39 amino acids at the C-terminus form an intracellular domain. Four extracellular cysteines (C⁴¹, C⁸⁷, C¹²⁶, C¹⁸⁵) form two disulfide bonds, constituting a typical IgSF hemispherical domain. In addition, there are three similar asparagine N-glycosylation sequences in the extracellular region, and the glycosylation determines the MMP activating activity of the CD147 molecule. The purified deglycosylated CD147 molecule cannot induce MMP activity, and antagonizes the activity of the natural CD147 molecule. Endo-F glycosidase digestion reduces the molecular weight of the CD147 molecule by about 30 kD, demonstrating that the glycosylation of the CD147 molecule is primarily in the form of N-linked oligosaccharides.
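Candidate N-glycosylation sites such as those mentioned above are conventionally located by scanning a sequence for the sequon Asn-X-Ser/Thr, where X is any residue except proline. The following is a minimal Python sketch under that standard definition. The function name `find_sequons` and the toy sequence are hypothetical illustrations and are not taken from SEQ ID NO: 1.

```python
import re

# Standard N-glycosylation sequon: N, then any residue except P, then S or T.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT the CD147 sequence.
    toy = "MANGSAANPTQNLT"
    print(find_sequons(toy))
```

A sequon only marks a potential site; whether it is actually glycosylated has to be established experimentally, for example by glycosidase digestion as described above.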

The 24 amino acid residues in the transmembrane region are highly conserved in the CD147 molecules of human, mouse and chicken, indicating that the transmembrane fragment of the CD147 molecule plays an important role and has similar functions in various species [19]. There is a charged glutamic acid residue in the transmembrane region, which is uncommon in other membrane protein molecules, indicating that the CD147 molecule can associate with other transmembrane proteins [20]. The transmembrane region contains 3 leucine residues (L²⁰⁶, L²¹³, L²²⁰) and one phenylalanine residue, which occur once every 7 residues and form a typical leucine zipper structure. The charged residue and the leucine zipper structure in the transmembrane region are a potential protein interaction motif, which is very likely to mediate the participation of the CD147 molecule in the generation of a signal transduction polypeptide chain or a component of a membrane transporter protein.

Currently there is no comprehensive analysis of the cytoplasmic domain. Schlosshauer B et al. discovered the coexistence of CD147 and F-actin by a double-labeling method, and found that the expression of CD147 on the membrane surface was related to microfilament proteins in the cytoskeleton. Whether there is a phosphorylation site and how a signal is transduced are still unknown.

The CD147 gene is encoded by 8 exons and is 10.8 kb in total length. The nucleotide and protein sequences are shown in FIG. 1.

Exon1 (aa 1~23, 107 bp) and Exon2 (aa 24~75, 154 bp) are separated by Intron1 (about 6.5 kb), which is the largest intron in the EMMPRIN gene. Intron2 is about 700 bp in length and is the second largest intervening sequence in the gene. Exon3 (aa 76~102, 83 bp) and Exon4 (aa 103~148, 138 bp) are separated by Intron3 (300 bp). Intron4 is about 650 bp in length. Exon5 covers aa 149~240 (276 bp). Intron5 is about 550 bp in length. Exon6 (aa 241~249, 25 bp) is very short. Intron6 is about 250 bp in length. Exon7 covers aa 250~269 (69 bp). Intron7 is about 300 bp in length. The last exon, Exon8, is 736 bp in length. Exon1 encodes the 5′-untranslated region (5′-UTR) and a signal peptide. Exon2 and Exon3 encode the first Ig domain (Ig1): Exon2 encodes 52 codons, about 66% of Ig1, and Exon3 encodes 27 codons, about 34% of Ig1. Exon4 and Exon5 encode the second Ig domain, of which Exon4 encodes 46 codons, about 45% of the domain. Exon5 is a “binding” exon, encoding the remaining 55% of the Ig domain, the 24 amino acid residues of the transmembrane domain and a small part of the intracellular domain. Exon6 and Exon7 encode the intracellular domain, and Exon7 also encodes the stop codon and 5 nucleotides of the 3′-UTR. Exon8 encodes the rest of the 3′-UTR.
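The codon counts and percentages quoted for the first Ig domain can be checked from the residue ranges. The following minimal Python sketch assumes that Exon2 spans residues 24 to 75, which is the range consistent with the stated 52 codons. The names `EXONS`, `codon_count` and `share_percent` are illustrative only.

```python
# Residue ranges (inclusive) for the exons encoding the two Ig domains.
# Assumption: Exon2 is taken as residues 24-75, matching the stated 52 codons.
EXONS = {
    "Exon2": (24, 75),
    "Exon3": (76, 102),
    "Exon4": (103, 148),
    "Exon5": (149, 240),
}


def codon_count(first, last):
    """Number of codons covering residues first..last, inclusive."""
    return last - first + 1


def share_percent(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


# Ig1 is encoded by Exon2 plus Exon3.
ig1_total = codon_count(*EXONS["Exon2"]) + codon_count(*EXONS["Exon3"])
exon2_share = share_percent(codon_count(*EXONS["Exon2"]), ig1_total)
exon3_share = share_percent(codon_count(*EXONS["Exon3"]), ig1_total)

if __name__ == "__main__":
    for name, (first, last) in EXONS.items():
        print(name, codon_count(first, last), "codons")
    print("Exon2 share of Ig1:", exon2_share, "%")
    print("Exon3 share of Ig1:", exon3_share, "%")
```

Under this assumption, Exon2 and Exon3 together account for 79 codons, and the rounded shares agree with the 66% and 34% given in the text.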

The CD147 molecule is a potential adhesion molecule whose function is similar to those of N-CAM, I-CAM and other relevant IgSF subgroup molecules, and it is involved in cell-cell and cell-matrix adhesion. Experiments have shown that the CD147 molecule can form a protein complex with α3β1 or α6β1 of the integrin family. The functions of the complex are still unknown, but possibly relate to the adhesion between tumor cells and extracellular matrix, and between tumor cells and interstitial cells. Other experiments have demonstrated that some CD147 monoclonal antibodies can suppress the homotypic aggregation of the estrogen dependent breast cancer cell lines MCF-7 and MDA-435, and the adhesion of MCF-7 cells to Type IV collagen, FN or LN. CD147 is thus a novel cell surface adhesion molecule that mediates cell adhesion. The expression of this kind of molecule and its functions in tumors are a focus of current tumor research.

The biological function of a protein is largely dependent on its spatial structure, and the variety of protein conformations results in different biological functions. The relationship between the structure and the function of a protein is the basis for protein function prediction and protein design. A protein molecule can exhibit its specific biological activity only in its specific 3D spatial structure. Even slight damage to the spatial structure may reduce or abolish the biological activity of the protein. The specific structure allows the protein to bind to its specific ligand molecule, e.g. the binding of oxygen to hemoglobin or myoglobin, an enzyme to its substrate molecule, a hormone to its receptor, or an antibody to its antigen. If the sequence of a gene is known, scientists can deduce the amino acid sequence of the encoded protein, but cannot directly deduce the spatial structure of the protein. With the development of structural biology in recent years, the steric structures of many proteins have been determined by X-ray diffraction or NMR analysis in combination with 3D structure and molecular design techniques. Understanding the spatial structure of a protein contributes to determining the protein's function. Also, if a protein is a drug target, the gene sequence and the structural information of the protein can be combined to design a small molecule compound that inhibits the disease-associated protein so as to treat the disease.

The elucidation of the CD147 protein structure would play an important role in disease treatment and in the design of diagnostic agents. Prior to the present invention, the structure of CD147 and the mechanisms by which CD147 promotes tumor progression, regulates tumor invasion or metastasis, mediates inflammation, or facilitates viral infection of host cells remained unclear. Therefore, although the general function and role of CD147 are known, the development of an agent for the treatment or diagnosis of a disease (e.g. liver cancer) has been restricted by the absence of structural information on the protein.

Thus, there is a need to clarify the three dimensional structure of the CD147 molecule and to establish a corresponding model. Such a structure and model would contribute to the determination of the active sites through which the CD147 molecule interacts with a molecule such as an integrin or CypA, and could be utilized to assist disease treatment, e.g. through structure-based drug design.

The term “steric structure of a protein” as used herein refers to three dimensional structure of a protein determined by amino acid sequence of the protein under certain conditions, namely the three dimensional structure formed by the folding of a protein with an amino acid sequence under certain conditions. X-ray diffraction or NMR can be used to determine the structure of a protein.

SUMMARY OF THE INVENTION

The present invention relates to the 3D coordinates and the 3D structure of the crystal of CD147 extracellular region with P4₁2₁2 space group in the tetragonal crystal system and an amino acid sequence as shown in SEQ ID NO: 1, to the structural model and active interacting sites of a complex of CD147 extracellular region with its antibody HAb18 determined by said structure of CD147 extracellular region, and to use of the structures, models and active interacting sites. It is well known in the field of protein crystallography that obtaining a crystal of high-resolution diffraction quality is a bottleneck in resolving a protein structure. Many proteins require stable disulfide bonds for the correct spatial structure. Without such disulfide bonds, these proteins may be degraded or form inclusion bodies. The redox potential of the E. coli cytoplasm is the main reason why such a protein cannot fold correctly, and a disulfide bond can normally form only after the protein is transported to the periplasmic space. We used the Origami B (DE3) strain, which carries mutations in both glutathione reductase (gor) and thioredoxin reductase (trxB) and therefore enhances the formation of disulfide bonds in the cytoplasm, to generate more soluble and active protein. The study also showed that, even though the overall expression levels were similar, the level of active protein from Origami (DE3) was 10 times higher than that from other strains. This is a possible reason why CD147 is mainly expressed in the form of inclusion bodies in the commonly used BL21 (DE3), whereas a soluble active protein can be obtained in Origami (DE3). Furthermore, before the present invention, a crystal of CD147 extracellular region of sufficient quality for 3D structure resolution had not been obtained. Thus, before the present invention, the 3D structure of CD147 extracellular region had not been determined, let alone its active interacting sites. The present inventors clarify the 3D structure and the active interacting sites of CD147 extracellular region for the first time, and also provide a 3D model for drug design targeting CD147.

Accordingly, one aim of this invention is to provide a crystal with enough quality for determining 3D structure of CD147 extracellular region. The invention also relates to a method for growing the crystal of CD147 extracellular region.

The second aim of this invention is to provide the 3D structure and model information of CD147 extracellular region.

Knowledge of the 3D structure of CD147 extracellular region provides a means (i.e. structure-based drug design) to design and produce an agent such as an antibody, a peptide, a protein, or a small molecule (including a chemical) for regulating tumorigenesis or tumor development and inhibiting tumor metastasis in an animal (including a human). For example, using various computer software programs and models, the agent can be designed to inhibit the biological activities of the CD147 molecule so as to inhibit tumor progression, the occurrence of inflammation, viral infection of host cells and so on.

Accordingly, the third aim of this invention is to provide a method comprising utilizing the obtained 3D structural information of CD147 extracellular region in combination with computer modeling to establish the structure of HAb18, an anti-CD147 monoclonal antibody, using techniques such as computer simulation and molecular docking to determine the active sites of CD147 extracellular region, and designing an agent for disease treatment or diagnosis based on the amino acid sequences and the structures of the active sites.

Accordingly, the fourth aim of this invention is to utilize the 3D structure of CD147 extracellular region to design and produce an agent such as an antibody, a protein, a peptide, a small molecule (a chemical) and so on, capable of binding to CD147 extracellular region and inhibiting or stimulating the biological activities of CD147. The inhibitory or stimulatory agent can be determined according to the following: (a) providing the 3D structure of CD147 extracellular region, (b) designing an agent such as an antibody, a protein, a peptide or a small molecule (a chemical), etc., by using the structure, (c) synthesizing the agent, and (d) evaluating the change, caused by the synthesized agent, in CD147 mediated embryo development, tumor invasion or metastasis, occurrence or development of inflammation, viral infection or proliferation, etc.

DESCRIPTION OF THE DRAWINGS

FIG. 1: The organization of exons/introns in CD147 gene.

FIG. 2: The crystal structure of the CD147 extracellular region. FIG. 2A depicts the space group of the crystal of CD147 extracellular region and shows one asymmetric unit, which contains four molecules indicated as chains A, B, C and D, each chain containing two Ig-like domains. FIG. 2B is an enlarged image of one chain, in which the two β-sheets in the two different types of Ig-like domains are indicated (the two types are composed of different numbers and orientations of β-strands, respectively). References: [1] Chothia C, Jones E Y. The molecular structure of cell adhesion molecules. Annu Rev Biochem. 1997; 66:823-62. [2] Vaughn D E, Bjorkman P J. The (Greek) key to structures of neural adhesion molecules. Neuron. 1996 February; 16(2):261-73.

FIG. 3: The complex of CD147 with its monoclonal antibody HAb18.

FIG. 4: The active binding sites of CD147.

FIG. 5: The amplification of the mutated fragment using high-fidelity polymerase HS-Primestar. From left to right: Marker2000; Blank; P2 (Primers M5&F3); Full-length CD147 (Primers F5&F3); P1 (Primers F5&M3).

FIG. 6: Sequencing result.

FIG. 7: SDS-PAGE result. From left to right: Marker; uninduced CD147; induced CD147; uninduced Mut-147; induced Mut-147.

FIG. 8: Western blot result. From left to right: Induced CD147; Induced Mut-147. It can be seen that the affinity of the mutated Mut-147 to the anti-CD147 monoclonal antibody HAb18 is clearly decreased, demonstrating that the three amino acids E28, T30 and D44 are important in the binding of CD147 to the monoclonal antibody HAb18.

DETAILED DESCRIPTION OF THE INVENTION

The invention relates to the crystal of CD147 extracellular region and the 3D structure thereof, to a model of the 3D structure, to a structure model and active interacting sites of a complex of CD147 extracellular region with its antibody HAb18 (determined by the structure of CD147 extracellular region), to a drug design method using the 3D structure, to an agent such as an antibody, a protein, a peptide, or a small molecule identified by the method, to use of the agent in disease treatment or diagnosis, and to use of the structure, model, and active interacting sites of CD147 extracellular region. It should be understood that the terms “homolog”, “fragment”, “variant”, “analogue” and “derivative” of the amino acid sequence of the crystal of the CD147 extracellular region, as used herein and in the appended claims, refer to a molecule obtained by substituting, varying, altering, replacing, deleting or adding one or more amino acid residues in the amino acid sequence of the crystallized CD147 extracellular region of the invention, or to a molecule with a sequence contained within the amino acid sequence of the crystallized CD147 extracellular region according to the invention.

The “variant” as used herein refers to the addition, deletion or replacement of an amino acid residue in the sequence of wildtype CD147 extracellular region or a fragment thereof. The variant of the amino acid sequence of the CD147 extracellular region according to the invention may comprise (1) the deletion of any one or more amino acid residues in the sequence, or (2) the replacement of any one or more amino acid residues in the sequence with one or more amino acid residues.

The term “homolog” as used herein refers to similarity in amino acid sequences. When the amino acid sequence of the crystal of CD147 extracellular region of the invention is compared with another amino acid sequence, the two sequences are considered homologs if at least 70% of their residues are identical. The comparison can be carried out manually or using computer software such as CLUSTAL for homologous alignment.
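The 70% criterion can be illustrated with a short Python sketch that operates on two sequences already aligned to equal length, for instance by a tool such as CLUSTAL. It rests on one labeled assumption: columns containing a gap (`-`) are excluded from the comparison, which is only one of several conventions for computing percent identity. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over gap-free columns of an alignment.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap column: not counted under this convention
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def is_homolog(aligned_a, aligned_b, threshold=70.0):
    """Apply the 'at least 70% identical' criterion from the definition."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy alignment, not derived from SEQ ID NO: 1.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(is_homolog("ACDEFGHIKL", "ACDEFGHIKV"))
```

A different denominator, such as the length of the shorter sequence or the full alignment length, would give a different percentage for gapped alignments, so the convention should be stated whenever the threshold is applied.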

The term “fragment” as used herein refers to any fragment of the amino acid sequence of the crystal of CD147 extracellular region provided in the invention, and this fragment is able to be crystallized.

The term “analogue” as used herein refers to a compound with a similar amino acid sequence to that of the crystal of CD147 extracellular region provided in the invention, and the similar amino acid sequence contains an amino acid replacement or deletion that does not affect the crystallizability of the amino acid sequence.

The term “derivative” as used herein refers to an amino acid sequence obtained by chemical modification in the amino acid sequence of the crystal of CD147 extracellular region of the invention, such as the substitution of hydrogen in alkyl group, acyl group or amino group, etc.

The crystal of CD147 extracellular region according to the invention refers not only to the naturally occurring crystal of the CD147 molecule or the crystal obtained by crystallization of wildtype CD147, but also to the crystal of a mutant of wildtype CD147 that has the same 3D structure as the wildtype CD147 molecule. A mutant of the wildtype CD147 molecule may result from: substitution of at least one amino acid in the wildtype CD147 molecule, addition/deletion of an amino acid within a peptide chain of the wildtype CD147 molecule, or addition/deletion of an amino acid at the N- or C-terminus of a peptide chain of the wildtype CD147 molecule.

In one embodiment, the crystal of CD147 extracellular region has a tetragonal unit cell with the following lattice constants: a=126.481 Å±0.5%, b=126.481 Å±0.5%, c=169.926 Å±0.5%, α=β=γ=90°. Its amino acid sequence is set forth in SEQ ID NO: 1. Moreover, the invention provides a method for preparing the crystal of CD147 extracellular region, comprising the following steps: cloning the gene sequence for CD147 extracellular region into the prokaryotic expression vector pET21a(+), expressing it in the Origami B (DE3) strain, purifying the CD147 extracellular region protein, preparing a solution of the CD147 extracellular region protein at 5-20 mg/ml and a reservoir solution of 0.5M ammonium sulfate, 0.1M sodium citrate and 1.0M lithium sulfate, pH 5.6, mixing the protein solution with the reservoir solution, and allowing the mixture to stand until a crystal of CD147 extracellular region forms and grows to the predefined size or larger.
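The lattice constants given above (a = b ≠ c, all angles 90°) are metrically those of a tetragonal cell, which agrees with the P4₁2₁2 space group named in the Summary. The following Python sketch computes the unit cell volume with the general triclinic formula and gives a purely metric classification. The function names and the tolerance are assumptions, and a real assignment of the crystal system depends on the symmetry of the diffraction data, not on cell lengths alone.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms (general triclinic formula)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def metric_lattice(a, b, c, alpha, beta, gamma, tol=1e-3):
    """Classify an orthogonal cell from its edge lengths only (metric check)."""
    if any(abs(x - 90.0) > tol for x in (alpha, beta, gamma)):
        return "non-orthogonal (not classified here)"

    def same(x, y):
        return abs(x - y) <= tol * max(x, y)

    if same(a, b) and same(b, c):
        return "cubic"
    if same(a, b) or same(b, c) or same(a, c):
        return "tetragonal"
    return "orthorhombic"


if __name__ == "__main__":
    cell = (126.481, 126.481, 169.926, 90.0, 90.0, 90.0)
    print(metric_lattice(*cell))
    print(round(cell_volume(*cell)), "cubic angstroms")
```

For the stated constants the volume is roughly 2.72 × 10⁶ Å³, and the ±0.5% tolerance on each edge corresponds to about ±1.5% on the volume.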

In another embodiment, X-ray diffraction is adopted to determine the 3D structure of the crystal of CD147 extracellular region, comprising: resolving the structure by using single wavelength anomalous dispersion (SAD). One set of diffraction data for the crystal of the natural protein (2.8 Å resolution) and one set of diffraction data for the crystal of the seleno-substituted protein (3.1 Å resolution) are collected from each of BL17A and NW12 beamlines of the synchrotron radiation light source (Photon Factory, Japan), and the software package HKL2000 is used to process the data. Firstly the data of the crystal of seleno-substituted protein is used to obtain phase information and initial structure model, and then the diffraction data of the crystal of the natural protein is used for structure determination and refinement.

A suitable computer modeling program such as WAM (Web Antibody Modeling) can be used in simulating calculation of a 3D structure of CD147 extracellular region that can substantially satisfy the definition of claim 1. This simulating calculation requires some information like: (1) the amino acid sequence of CD147 extracellular region; (2) the amino acid sequences of the parts related to 3D structure; and (3) specific 3D structural information. The 3D structure of other forms of CD147 extracellular regions (a mutant, a fragment, a derivative, a variant, an analogue or a homolog of CD147 extracellular region) essentially matching with the 3D structure of CD147 extracellular region according to the present invention can also be calculated using a molecular replacement method, which is described in detail below.

Table 1 gives a 3D structure of CD147 extracellular region which is suitable for simulating or calculating the 3D structure of another CD147 extracellular region. According to the invention, using the above 3D structure, persons skilled in the art can simulate or calculate the 3D structure of a mutant, a fragment, a derivative, a variant, an analogue or a homolog of CD147 extracellular region that has an amino acid sequence similar to that of the CD147 extracellular region described in the invention. These techniques are based on the information obtained from the analysis of the crystal of CD147 extracellular region. Therefore, the determination of the 3D structure of the crystal of CD147 extracellular region enables one to apply conventional techniques in the art to derive the 3D structure or model of a mutant, a fragment, a derivative, a variant, an analogue or a homolog of CD147 extracellular region. The structure of any such mutant, fragment, derivative, variant, analogue or homolog can even be derived in the absence of structural data from its own crystal. Moreover, when working on the crystal structure of a new mutant, fragment, derivative, variant, analogue or homolog of CD147 extracellular region, the information obtained from the crystal of CD147 extracellular region in the invention can be used to optimize the simulation of its 3D structure. One advantage of the invention is that, in the absence of other crystal structural data for a mutant, a fragment, a derivative, a variant, an analogue or a homolog of CD147 extracellular region, the 3D structure of that molecule can be simulated from the differences between its amino acid sequence and that of the CD147 extracellular region according to the present invention. In addition, the invention makes it feasible to determine the active sites of CD147 extracellular region for an antibody, an integrin, CypA or another interacting molecule, and to carry out structure-based drug design and drug screening. An antibody, a protein, a peptide or a small molecule (a chemical) thus designed and screened can effectively affect the activity of the CD147 molecule.

The crystal model of CD147 extracellular region according to the invention can contribute to the selection of an antibody, a ligand or another interacting molecule of CD147, and to the determination of the active interacting sites, especially the selection of an inhibitory molecule such as an antibody, a peptide, a protein or a chemical molecule. For example, CD147 extracellular region can be co-crystallized with an antibody, a ligand or another interacting molecule; an antibody, a ligand or another interacting molecule can be soaked into the crystal of CD147 extracellular region; a computer can be used to dock the model of the crystal of CD147 extracellular region with the model of the crystal of an antibody, a ligand or another interacting molecule so as to screen for a ligand of CD147, especially an inhibitory molecule of CD147; or a computer can be used to design and screen for a reagent, an antagonist or a drug that can bind to the active binding sites of the crystal of CD147 extracellular region. Before the discovery of the 3D structure of the invention, no information was available for the structure-based development of a diagnostic or therapeutic compound targeting the CD147 extracellular region. To date it has not been possible to perform such design based on the linear amino acid sequence alone. Structure-based drug design refers to using computational simulation to predict the interaction of a protein conformation with an antibody, a peptide, a polypeptide, a protein or a chemical substance. Generally, for a therapeutic antibody, peptide, protein or chemical molecule to interact effectively with a protein, the 3D structure of the therapeutic compound must have a compatible conformation to ensure binding. Knowledge of the protein 3D structure enables persons skilled in the art to design a diagnostic or therapeutic antibody, peptide, protein or chemical molecule with such a compatible conformation. For example, the information on the binding sites of CD147 extracellular region for its antibody HAb18 enables persons skilled in the art to design an antibody, a peptide, a protein or a chemical that can bind to CD147 and inhibit its biological activities.

The determination of the 3D structure of CD147 extracellular region provides the important information for the determination of possible active sites in CD147 extracellular region. The structural information is helpful for designing an inhibitor of CD147 molecule for the active sites of the molecule. For example, the computer technique can be used to identify a ligand that can bind to an active site or design a drug, or X-ray crystal diffraction analysis can be used to identify and locate the binding sites of a ligand.

Greer et al. used iterative cycles of computer modeling, protein-ligand complex structure determination and X-ray diffraction to design an inhibitor of thymidylate synthase. An inhibitor of the CD147 molecule can be designed in the same way. For example, using the 3D structure of CD147 extracellular region according to the invention, we can use computer modeling to design an agent such as an antibody, a protein, a peptide or a small molecule capable of binding to the functional active sites or other sites in the 3D structure according to the invention, synthesize the agent, form a complex of the agent with CD147, and then analyze the complex by X-ray crystal diffraction to identify the actual binding sites. Based on the result of X-ray crystal diffraction, the structure and/or functional groups of the ligand can be adjusted accordingly until an optimized agent molecule is obtained.

Moreover, based on the 3D structure of CD147 extracellular region, many computer software programs can be used in rational drug design so as to design an inhibitor of the CD147 molecule. For example, automated ligand-receptor docking software can be used (Jones et al. in Current Opinion in Biotechnology, Vol. 6, (1995), 652-656). This method requires detailed and precise 3D structural information of CD147 extracellular region.

Drug design by linking fragments also requires precise 3D structural information of the target receptor. This method determines the binding sites of several ligands on the target molecule, and then constructs a molecular scaffold to link the ligands. The ligands are thus linked to form a potential lead compound, which is then confirmed by iterative techniques. An inhibitor of CD147 can also be designed in this way.

In the above described methods of structure-based drug design, it is first necessary to determine an agent that can interact with the target biological molecule. Sometimes this agent can be found in the literature. However, for most target molecules the inhibitors are unknown, or a novel inhibitor is desired. In such cases, one first needs to search a database (e.g. the Cambridge Structural Database) for compounds that can interact with the active sites or other sites of the target molecule. Where the structure of a target molecule is unknown, the search is generally based on a pharmacokinetic characteristic, such as metabolic stability or toxicity. However, the determination of the structure of CD147 extracellular region makes it possible to screen based on the structure and characteristics of the active sites of the molecule. The screening can be based on whether a potential inhibitor can form an effective 3D pharmacophore with CD147 extracellular region.

In one embodiment of the invention, a method is provided for selecting and determining an antibody, a ligand or another interacting molecule of CD147 extracellular region by using computer-assisted 3D modeling and molecular docking, comprising: (1) providing a protein structure comprising a 3D structure or model of the CD147 extracellular region according to the invention, and predicting the structure of a candidate antibody, ligand or interacting molecule by computer 3D modeling; (2) docking the 3D structures of CD147 extracellular region and the antibody, the ligand or the interacting molecule; and (3) evaluating whether the 3D structure of the antibody, the ligand or the interacting molecule can bind to the 3D structure of the active sites of CD147 extracellular region. The method may further comprise: (4) analyzing the biological activities of the antibody, the ligand or the interacting molecule of CD147 extracellular region toward CD147; and (5) evaluating whether the antibody, the ligand or the interacting molecule can regulate the biological functions of CD147.

In another embodiment of the invention, a computer-assisted method for structure-based drug design is provided, comprising: (1) providing a protein structure comprising a 3D structure or model of the CD147 extracellular region according to the invention; (2) designing an antibody, a peptide, a protein or a compound by using the 3D structure or model; (3) synthesizing the antibody, the peptide, the protein or the compound.

The ligand, interacting molecule or inhibitory antibody, peptide, protein or chemical according to the invention can be identified by various methods known to persons skilled in the art. For example, a candidate ligand, interacting molecule, antibody, peptide, protein or chemical can be allowed to bind to or interact with CD147 extracellular region protein, either in solution or on cells, and then screened and identified by an immunoassay such as enzyme-linked immunosorbent assay (ELISA) or radioimmunoassay (RIA), by a binding assay such as Biacore, or by yeast two-hybrid, phage peptide library or antibody library screening.

The following examples illustrate the invention in more detail. They are provided to explain, not to limit, the invention.

Example 1 Generation of the Crystal of CD147 Extracellular Region and Resolution of the Structure of the Crystal

1. Cloning the gene sequence for CD147 extracellular region into the prokaryotic expression system pET21a(+), expressing it in Origami B (DE3) strain, and purifying CD147 extracellular region.

2. The CD147 extracellular region has the sequence as shown in SEQ ID NO: 1.

3. Crystallizing CD147 extracellular region.

4. Resolving the structure of the crystal.

Single wavelength anomalous dispersion (SAD) is used to resolve the structure. One set of diffraction data for the crystal of a natural protein (2.8 Å resolution) and one set for the crystal of a seleno-substituted protein (3.1 Å resolution) are collected from each of BL17A and NW12 beamlines of the synchrotron radiation light source (Photon Factory, Japan), and the software package HKL2000 is used to process the obtained data. First the data of the crystal of the seleno-substituted protein are used to get phase information and initial structure model, and then the diffraction data of the crystal of the natural protein are used for structure determination and refinement. The space group of the crystal is P4₁2₁2, and there are four molecules in one asymmetric unit, the crystal lattice parameters being: a=126.481 Å, b=126.481 Å, c=169.926 Å, α=90°, β=90°, γ=90°.
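As a rough consistency check on the lattice parameters above, the Matthews coefficient and the solvent content can be estimated from the unit cell volume. The following is a minimal sketch, not part of the original procedure. It assumes a molecular weight of about 20 kD per molecule (the value given in claim 2), eight symmetry operators for space group P4₁2₁2, and the conventional constant of 1.23 Å³/Da from the Matthews approximation; the function name `matthews` is illustrative only.

```python
def matthews(a, b, c, n_symops, n_per_asu, mw_da):
    """Return (Matthews coefficient in A^3/Da, solvent fraction).

    Valid only for a cell with alpha = beta = gamma = 90 degrees,
    where the volume is simply a * b * c.
    """
    volume = a * b * c
    z = n_symops * n_per_asu          # molecules per unit cell
    vm = volume / (z * mw_da)         # Matthews coefficient
    solvent = 1.0 - 1.23 / vm         # Matthews approximation (assumed constant)
    return vm, solvent


# Lattice constants from the text; 8 operators and 20 kD are assumptions
# stated in the lead-in paragraph.
vm, solvent = matthews(126.481, 126.481, 169.926,
                       n_symops=8, n_per_asu=4, mw_da=20000.0)
print(f"Vm = {vm:.2f} A^3/Da, solvent content = {solvent:.1%}")
```

Under these assumptions the estimated solvent content is close to 71%, which agrees with the value of about 70% given in claim 2.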

The Detailed Steps

1. Preparation and Purification of CD147 Extracellular Region

1) Vector: pET21a(+), NdeI/XhoI cloning sites

2) Host cell: Origami B (DE3)

3) Expression: 1 mM IPTG induction, 20° C., overnight

Purification

1) HiTrap Q HP, 20 mM Tris pH 8.0 with/without 1 M NaCl

2) Mono Q HR 10/10, 20 mM Tris pH 8.0 with/without 1 M NaCl

3) HiLoad 16/60 Superdex 75 prep grade, 20 mM Tris pH 8.0, 150 mM NaCl

2. Protein Crystallization

Hanging-Drop Vapor Diffusion Procedure

1) Protein concentration: 5-20 mg/ml

2) Reservoir solution: 0.5M ammonium sulfate, 0.1M sodium citrate and 1.0M lithium sulfate, pH 5.6.

3) Growth period: 3-4 weeks, 4° C.

3. Structure Determination

X-Ray Diffraction and Data Collection

Single wavelength anomalous dispersion (SAD) was used to resolve the structure. One set of diffraction data for the crystal of a natural protein (2.8 Å resolution) and one set for the crystal of a seleno-substituted protein (3.1 Å resolution) were collected from each of BL17A and NW12 beamlines of the synchrotron radiation light source (Photon Factory, Japan), and the software package HKL2000 was used to process the obtained data. First the data of the crystal of the seleno-substituted protein were used to obtain phase information and initial structure model, and then the diffraction data of the crystal of the natural protein were used for structure determination and refinement (Table 1, FIG. 2).

Example 2

Modeling and docking of CD147 extracellular region and antibody HAb18, determination of the active sites of CD147 extracellular region, and experimental confirmation.

1. The structure of the complex formed by the CD147 extracellular region and its antibody had the following characteristics: there were 3 peptide chains in the molecular docking model of CD147 extracellular region and HAb18 molecule, namely the CD147 extracellular region and the variable domains of the light and heavy chains of HAb18; 6176 atoms were present, and 6635 covalent bonds were formed; and the docking energies were assessed as: binding energy=101.21, docking energy=29.05, intermolecular energy=30.54, torsional energy=70.67, internal energy=−1.49.

2. The active structural sites of the CD147 extracellular region had the following characteristics: the active sites of CD147 extracellular region were located in the membrane-distal loop region of the C2-set domain at the N-terminus of the protein, and the amino acid residues involved in the epitope were: Glu28, Thr30, Asp44, Ala45, Leu46, Pro47, Gly48, Lys50 and Glu52, of which Glu28, Thr30 and Asp44 were the most important, as they formed hydrogen bonds with CDRH3, CDRH2 and CDRL3 in HAb18.

3. Glu28, Thr30 and Asp44 of the CD147 extracellular region were mutated, and Western blotting showed that these three amino acids are involved in, and play an important role in, the site of CD147 that binds the anti-CD147 monoclonal antibody HAb18.
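The five energy terms listed in item 1 are internally consistent, which can be verified with simple arithmetic. The sketch below assumes the AutoDock 3 reporting convention, in which the docked energy is the sum of the intermolecular and internal energies and the estimated binding energy is the sum of the intermolecular and torsional energies; this convention is an assumption here and is not stated in the text. No units are given in the text, so none are assumed.

```python
import math

# Values reported for the HAb18/CD147 docking model (units not given in the text).
intermolecular = 30.54
internal = -1.49
torsional = 70.67
reported_docking = 29.05
reported_binding = 101.21

# Assumed AutoDock 3 convention (see lead-in):
docking = intermolecular + internal      # docked energy
binding = intermolecular + torsional     # estimated binding energy

docking_ok = math.isclose(docking, reported_docking, abs_tol=0.005)
binding_ok = math.isclose(binding, reported_binding, abs_tol=0.005)
print(docking_ok, binding_ok)
```

Both checks pass, so the reported values follow from the three component terms under the assumed convention.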

The Detailed Steps

1. WAM (Web Antibody Modeling) was used to perform homology modeling of HAb18; the overall G-factor evaluated with Procheck V3.4 was −0.21. DeepView/Swiss-pdbViewer V3.7 was used to perform Energy Minimisation and Compute Energy (GROMOS96 43B1 parameter set) on the HAb18 homology model, and the total energy was determined as −5888.806 kJ/mol.

2. DeepView/Swiss-pdbViewer V3.7 was used to revise the spatial structure of HAb18/CD147 complex, add polar and nonpolar hydrogens, and correct the charge numbers of atoms.

3. AutoDock V3.0.5 was used to dock HAb18 with the CD147 extracellular region, and the orientation for molecular docking was roughly determined. On this basis, the precision was increased and a refined docking was performed. The docking model with the most reasonable energy evaluation was then selected (FIG. 3). DeepView/Swiss-pdbViewer V3.7 was used to perform Energy Minimisation and Compute Energy (GROMOS96 43B1 parameter set) on the docking result, and the total energy of the complex formed between HAb18 and CD147 extracellular region was −15383.64 kJ/mol.

4. IFR Contact of STING MILLENNIUM was used to analyze the docking model to determine the amino acid residue sites for interaction (FIG. 4).

5. Experimental Confirmation for the Active Sites.

After several rounds of blind docking, the model with the lowest docking energy was subjected to energy minimization and molecular dynamics using Insight II 2005 to obtain an optimized docking model. The model showed that the interaction between the antibody and the antigen mainly involved two peptide segments (A₂₆TEVTG₃₁ and D₄₄ALPGQ₄₉), of which E28, T30 and D44 were the most important, as they formed hydrogen bonds with CDRH3, CDRH2 and CDRL3 in HAb18. In the numbering of the overall sequence, the corresponding mutations are therefore E49A, T51A and D65A. At the nucleotide level, the mutations were: (1) GAG→GCG; (2) ACA→GCA; (3) GAC→GCC. Thus, the primers were designed as follows:

FULL-147 5′ (SEQ ID NO: 2): 5′ CTGAATTCCATATGGCTGCCGGCACAGTCTTC 3′

FULL-147 3′ (SEQ ID NO: 3): 5′ GCGCTCGAG GTGGCTGCGCACGCGGAGCG 3′

Mut-147 3′ (SEQ ID NO: 4): 5′ CACGCCCCCCTTCAGCCAGCGGTGCCCTGCGACCGCTGT 3′

Mut-147 5′ (SEQ ID NO: 5): 5′ CTGGCTGAAGGGGGGCGTGGTGCTGAAGGAGGCCGCG 3′
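The residue renumbering and the codon changes stated before the primer list can be checked in a few lines. This is a minimal sketch: the codon table contains only the six codons named in the text (standard genetic code), and the offset of 21 residues between the two numbering schemes is inferred from E28 corresponding to E49; the text does not state the offset explicitly.

```python
# Minimal codon table covering only the codons named in the text.
CODON_TO_AA = {
    "GAG": "E", "GCG": "A",
    "ACA": "T", "GCA": "A",
    "GAC": "D", "GCC": "A",
}

# (residue number in extracellular-region numbering, wild-type codon, mutant codon)
MUTATIONS = [(28, "GAG", "GCG"), (30, "ACA", "GCA"), (44, "GAC", "GCC")]

OFFSET = 21  # inferred from E28 -> E49; an assumption, not stated in the text


def describe(pos, wt_codon, mut_codon, offset=OFFSET):
    """Return a mutation label such as 'E49A' in the overall-sequence numbering."""
    wt = CODON_TO_AA[wt_codon]
    mut = CODON_TO_AA[mut_codon]
    return f"{wt}{pos + offset}{mut}"


labels = [describe(*m) for m in MUTATIONS]
print(labels)  # ['E49A', 'T51A', 'D65A']
```

Each codon change alters a single base at the second codon position and converts the wild-type residue to alanine, in agreement with the mutations E49A, T51A and D65A.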

By PCR, the P1 fragment was obtained using primers FULL-147 5′ and Mut-147 3′, and the P2 fragment was obtained using primers Mut-147 5′ and FULL-147 3′. The products P1 and P2 were annealed, and the primers FULL-147 5′ and FULL-147 3′ were used again to generate the full-length mutated Mut-147 for prokaryotic expression of the extracellular region. The expressed product was finally identified by SDS-PAGE and Western blotting (FIGS. 5-8).
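For the P1 and P2 fragments to anneal, the two internal primers must share a complementary stretch. The sketch below looks for the longest region in which the reverse complement of Mut-147 3′ ends with the same bases that Mut-147 5′ begins with. The helper names are illustrative, and the check says nothing about annealing temperature or about where the mutations lie within the primers.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_overlap(upstream, downstream):
    """Longest suffix of `upstream` that equals a prefix of `downstream`."""
    for n in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream.endswith(downstream[:n]):
            return downstream[:n]
    return ""


MUT_147_3 = "CACGCCCCCCTTCAGCCAGCGGTGCCCTGCGACCGCTGT"  # SEQ ID NO: 4
MUT_147_5 = "CTGGCTGAAGGGGGGCGTGGTGCTGAAGGAGGCCGCG"    # SEQ ID NO: 5

# The top-strand end of P1 is the reverse complement of the Mut-147 3' primer.
overlap = longest_overlap(reverse_complement(MUT_147_3), MUT_147_5)
print(len(overlap), overlap)
```

A non-trivial overlap indicates that the 3′ end of P1 and the 5′ end of P2 can prime each other in the second round of PCR.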

The affinity of the mutated Mut-147 for the anti-CD147 monoclonal antibody HAb18 is clearly decreased, demonstrating that the three amino acids E28, T30 and D44 play an important role in the interaction between CD147 and the monoclonal antibody HAb18.

Result:

TABLE 1 CD147 extracellular region crystal diffraction data

                                     Protein crystal             Seleno-substituted protein crystal
Data collection
  Space group                        P4₁2₁2                      P4₁2₁2
  Cell dimensions
    a, b, c (Å)                      126.481, 126.481, 169.926   125.141, 125.141, 169.660
    α, β, γ (°)                      90, 90, 90                  90, 90, 90
  Resolution (Å)                     50 − 2.80 (2.90 − 2.80) ^(a)   50 − 3.10 (3.21 − 3.10)
  R_(sym) or R_(merge) ^(b)          0.106 (0.586)               0.114 (0.697)
  I/σI                               18.4 (2.2)                  38.1 (5.8)
  Completeness (%)                   99.4 (100)                  99.7 (100)
  Redundancy                         7.3 (7.4)                   28.5 (29.6)
Refinement
  Resolution (Å)                     50 − 2.8
  No. reflections                    32613
  R_(work) / R_(free) ^(c)           0.255/0.296
  No. atoms                          5527
    Protein                          5527
  B-factors                          30.603
    Main chains                      29.279
    Side chains                      32.049
  R.m.s. deviations
    Bond lengths (Å)                 0.010
    Bond angles (°)                  1.397
  Ramachandran statistics
    Most favoured regions (%)        82.7
    Additionally allowed regions (%) 15.2
    Generously allowed regions (%)   1.8
    Disallowed regions (%)           0.3

^(a) Numbers in parentheses refer to the highest resolution shell.

^(b) $R_{merge} = \sum_{hkl}\sum_{i} |I_{i}(hkl) - \langle I(hkl) \rangle| / \sum_{hkl}\sum_{i} I_{i}(hkl)$

^(c) $R = \sum ||F_{o}| - |F_{c}|| / \sum |F_{o}|$
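The R_(merge) statistic reported in Table 1 is conventionally the sum, over all reflections and all repeated measurements, of the absolute deviation of each measured intensity from the mean intensity of its reflection, divided by the sum of all measured intensities. The following minimal sketch shows that calculation. The intensities are hypothetical values chosen for illustration and are not data from the CD147 crystals; in practice the statistic is produced by the data processing software.

```python
def r_merge(observations):
    """Compute R_merge.

    observations: dict mapping an (h, k, l) index to the list of
    intensities measured for that reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Hypothetical intensities for two reflections, for illustration only.
toy = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 1): [50.0, 50.0],
}
value = r_merge(toy)
print(f"R_merge = {value:.3f}")  # R_merge = 0.050
```

In the toy data only the first reflection contributes deviations (0 + 10 + 10 = 20), and the summed intensity is 400, giving 0.05.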

1. A crystal or model of CD147 (HAb18G/CD147, Basigin, EMMPRIN, M6, Neurothelin) extracellular region, wherein the crystal has P4₁2₁2 space group in a tetragonal crystal system and an amino acid sequence as shown in SEQ ID NO: 1, and the model has a three dimensional (3D) structure as in Table 1.
 2. The crystal or model of CD147 extracellular region of claim 1, wherein the crystal has a unit cell with the following crystal lattice constants: a=126.481 Å±0.5%, b=126.481 Å±0.5%, c=169.926 Å±0.5%, α=β=γ=90°, and there are four molecules in one asymmetric unit, wherein the molecular weight of one molecule is about 20±0.5 KD, and the solvent content is about 70±1%.
 3. The crystal or model of CD147 extracellular region of claim 1, wherein the model has 3D spatial structural characteristics of the crystal of CD147 extracellular region, and the structure of the crystal of CD147 extracellular region shows that the CD147 extracellular region contains two Ig-like domains, wherein the N-terminal domain is Ig C2-set, whereas the C-terminal domain close to the cell membrane belongs to Ig I set.
 4. The crystal or model of CD147 extracellular region of claim 1, wherein the sequence of CD147 extracellular region is of human origin.
 5. The crystal or model of CD147 extracellular region of claim 1, wherein the CD147 extracellular region in the crystal of CD147 extracellular region comprises the sequence as shown in SEQ ID NO: 1 or a homolog, fragment, variant, analogue or derivative thereof.
 6. The crystal of a complex of CD147 extracellular region with anti-CD147 antibody HAb18, an integrin, a ligand or another interacting molecule.
 7. The crystal of claim 6, wherein the CD147 extracellular region in the crystal of the complex of CD147 extracellular region with anti-CD147 antibody HAb18, integrin, a ligand or another interacting molecule is the CD147 extracellular region defined in any one of claims 1-4.
 8. A complex of CD147 extracellular region with anti-CD147 antibody HAb18 determined by computer modeling docking, wherein there are 3 peptide chains in a molecular docking model of CD147 extracellular region and monoclonal antibody HAb18 molecule, the CD147 extracellular region, HAb18VL and HAb18VH chains of HAb18; 6176 atoms are present and 6635 covalent bonds are formed, and the docking energies: binding energy=101.21, docking energy=29.05, intermolecular energy=30.54, torsional energy=70.67, and internal energy=−1.49.
 9. A 3D spatial structure determined by the crystal or model of CD147 extracellular region of any one of claims 1-4 or of the crystal of a complex of CD147 extracellular region with a ligand of CD147 molecule of claim 6, wherein the CD147 extracellular region is: (1) the full-length wildtype CD147 extracellular region or a mutant, fragment, derivative, variant or homolog thereof, or (2) a mutant, a fragment, a derivative, a variant or a homolog of wildtype CD147 extracellular region.
 10. The 3D spatial structure of claim 9, wherein an active site of CD147 extracellular region for binding to an antibody, a ligand or an interacting molecule of CD147 extracellular region is evaluated by computer modeling or other methods using the 3D structure of either (1) the full-length wildtype CD147 extracellular region or a mutant, fragment, derivative, variant or homolog thereof, or (2) a mutant, a fragment, a derivative, a variant or a homolog of wildtype CD147 extracellular region.
 11. The 3D spatial structure of claim 10, wherein the active site of CD147 extracellular region is positioned in the membrane-distal loop region of C2-set domain at N-terminus of the protein, and the amino acid residues involved in an epitope are: Glu28, Thr30, Asp44, Ala45, Leu46, Pro47, Gly48, Lys50, and Glu52.
 12. A method for identifying a compound that can bind to CD147 extracellular region, comprising doping candidate small molecule compounds with CD147 crystal and allowing co-crystallization, and screening candidate ligands or antagonists by using a method for measuring intermolecular interaction selected from Biacore, yeast two hybrid, phage peptide library or an antibody library, and comparing, designing and docking the 3D structures of CD147 crystal and a candidate ligand by computer modeling.
 13. A method for selecting and determining an antibody, a ligand or an interacting molecule of CD147 extracellular region by using computer-assisted 3D modeling and molecular docking technique, comprising: (1) computer modeling the 3D structures of CD147 extracellular region and a candidate antibody, ligand, or interacting molecule as described in claim 9 or 10, (2) docking the 3D structures of CD147 extracellular region and the antibody, the ligand or the interacting molecule, and (3) evaluating whether the 3D structures of the antibody, ligand or interacting molecule can bind to an active site of CD147 extracellular region.
 14. The method of claim 13, further comprising: (4) analyzing the biological activities of an antibody, a ligand or an interacting molecule of CD147 extracellular region and CD147, (5) evaluating whether the antibody, the ligand or the interacting molecule of CD147 extracellular region can regulate the biological function of CD147.
 15. The method of claim 14, wherein an antibody, a ligand or an interacting molecule that are designed and developed based on the crystal structure of CD147 extracellular region is used as a reagent, an antagonist or a drug.
 16. A computer-assisted drug design method for a biological active substance based on its structure, comprising: a. providing a protein model of CD147 extracellular region, wherein the model exhibits substantially a 3D structure of claim 1, b. designing an antibody, a peptide, a protein, or a small molecule by using the model, c. synthesizing the antibody, the peptide, the protein or the small molecule.
 17. The method of claim 16, wherein the method further comprises: d. evaluating the biological activities of the synthesized antibody, peptide, protein or small molecule.
 18. The method of claim 17, wherein the designing step comprises screening with a computer one or more database of chemical compounds, wherein the 3D structures of the compounds are known.
 19. The method of claim 18, further comprising interacting the compound screened in the screening step with the computer model.
 20. The method of claim 17, wherein the designing step comprises targeted drug designing or random drug designing.
 21. The method of claim 17, wherein the designing step comprises screening out those compounds predicted to be capable of binding to the 3D structure of CD147 extracellular region.
 22. The method of claim 17, wherein the biological activities refer to binding to the CD147 protein, inhibiting or stimulating the activity of the CD147 protein.
 23. (canceled)
 24. (canceled)
 25. The crystal or model of CD147 extracellular region of claim 1, wherein the crystal of CD147 protein is prepared by the following method: cloning the gene sequence for CD147 extracellular region into the prokaryotic expression system pET21a(+), expressing and purifying CD147 extracellular region; providing a solution of CD147 extracellular region at 5-20 mg/ml, preparing a pool solution of 0.5M ammonium sulfate, 0.1M sodium citrate and 1.0M lithium sulfate, pH 5.6; mixing the solution of CD147 extracellular region with the pool solution; standing the mixture solution for a period of time until a crystal of CD147 extracellular region grows to the predefined size or larger. 